Abstract: We consider a simple network, where a source and
destination node are connected with a line of erasure channels. 
It is well known that in order to achieve the min-cut capacity, 
the intermediate nodes are required to process the information. 
We propose coding schemes for this setting, and discuss each 
scheme in terms of complexity, delay, achievable rate, memory 
requirement, and adaptability to unknown channel parameters. 
We also briefly discuss how these schemes can be extended to 
more general networks. 

I. Introduction 

Networked systems arise in various contexts such as the
public internet, peer-to-peer networks, ad-hoc wireless
networks, and sensor networks. Such systems are becoming
central to our everyday life. The networked systems today 
employ traditional coding schemes for end-to-end connections 
and are generally not tailored to the network environment. 
For example, for reasons of design simplicity, intermediate 
nodes in a network are only allowed to forward and not to
process incoming information flows. However, as the size of 
communication networks grows, it becomes less clear if the 
benefits of the simple end-to-end approach outweigh those of 
coding schemes that employ intermediate node processing. 

From a theoretical point of view it is well known that
if intermediate nodes are allowed to decode and re-encode
the information sent by the source, with no constraints
on complexity and/or delay, then the information capacity
between a sender and a receiver is upper bounded by the
min-cut capacity of the network, as described in [2]. A crucial
point in making schemes that employ intermediate node
processing practical and attractive is realizing these benefits
without incurring excessive complexity and delay.

In this paper we propose coding schemes that employ 
intermediate node processing and discuss their performance. 
These schemes are based on fountain codes, a class of rateless
codes recently proposed in [4], [7] that have a number of
desirable properties for networked environments. We compare
different coding schemes based on their complexity, delay,
memory requirement, achievable rate, and adaptability; we
will define these metrics precisely in Section II.

For example, if we use an LT-code [4] to encode $k$ information
bits at the source and simply forward any received bit at
the intermediate nodes, we would need $O(k \log(k) / C)$ XOR
operations at the transmitter, and $O(k \log(k))$ XOR operations
at the receiver, where $C$ is the end-to-end capacity of the
overall channel measured in bits per channel use. Intermediate
nodes would have no processing or memory requirements, and
would not introduce delay. This scheme would further adapt to
unknown channel parameters. However, the achievable rate can
only approach the end-to-end capacity of the overall channel,
which is in general less than the min-cut capacity of the
network.
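
As a quick numerical illustration of this gap, the following sketch compares the rate limit of pure forwarding with the min-cut capacity of a line of erasure channels. It assumes independent links, so that forwarding alone turns the line into a single erasure channel in which a packet survives only if it survives every link; the function names are illustrative and do not come from the paper.

```python
from math import prod


def forwarding_capacity(erasure_probs):
    """End-to-end capacity when intermediate nodes only forward.

    A packet reaches the destination only if it survives every link, so the
    line behaves like one erasure channel with survival probability
    prod(1 - e_i) (assumption: links erase independently).
    """
    return prod(1.0 - e for e in erasure_probs)


def min_cut_capacity(erasure_probs):
    """Min-cut capacity of the line: the capacity of its worst link."""
    return min(1.0 - e for e in erasure_probs)


if __name__ == "__main__":
    eps = [0.1, 0.2]
    print(forwarding_capacity(eps))  # about 0.72
    print(min_cut_capacity(eps))     # 0.8
```

For a single link the two quantities coincide; the gap appears, and widens, as more lossy links are chained.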

In [5] the authors examined the benefits of intermediate 
node processing from an information theoretic point of view. 
Our work can be viewed as approaching the same problem 
from a coding theory point of view. 

In [8] a scheme was proposed that takes advantage of inter- 
mediate node processing to approach the min-cut capacity, and 
puts emphasis on the queuing theory aspects of the problem. 
The authors show that if we allow intermediate nodes to 
transmit random linear combinations of the incoming packets 
over a finite field GF(q), the transmission rate approaches the 
min-cut capacity as q goes to infinity. In this paper we will 
present alternative optimal coding schemes that approach the 
min-cut capacity using a constant field size, and in particular 
a binary field. 

The paper is organized as follows. In Section II we
present our model and performance metrics in more detail.
In Section III we describe our proposed coding schemes. In
Section IV we discuss generalization to other networks. In
Section V we compare our results with some related work in
more detail, and finally we conclude the paper in Section VI.

II. Model 

We consider a linear network that models a path between 
a source and a destination. The corresponding graph is com- 
prised of a source node, a destination node and a series of 
$L - 1$ intermediate nodes. The $L$ edges between the nodes
correspond to independent memoryless erasure channels, and
the information units sent over the $i$th link are erased with
probability $\epsilon_i$.

We assume a discrete time model, where each node can 
transmit one unit of information at each time slot. For coding 
purposes, we will treat each information unit as a symbol, 
but in general we can have a packet of symbols, and apply 
to each symbol of the packet the same encoding/decoding 
operation; in the following, we will refer to information units 
as packets or symbols interchangeably. Intermediate nodes 
have the capability to process the packets they receive, and 
use them to generate new packets. We ignore the transmission
delay along channels (as it is beyond our control), i.e., we
assume that a packet transmitted at time $d$, if not erased, is
received immediately at the next node in the chain.

Fig. 1. A path between a source A and a receiver C with $L = 2$ links.

Throughout this paper we will use as an illustrating example
the simple configuration depicted in Fig. 1 with $L = 2$ links;
we will also discuss the generalization of our results to longer
chains. The source node A encodes $k$ symbols to create $n_1$
coded outputs using a code $C_1$ and sends them over the
channel AB. Node B will receive on average $n_1(1 - \epsilon_1)$ coded
symbols over $n_1$ time slots. Node B will send $n_2$ packets,
using a code (more generally, processing) $C_2$. If node B
finishes transmitting at time $d$, where
$\max\{n_1, n_2\} \le d \le n_1 + n_2$, then node C will receive on
average $n_2(1 - \epsilon_2)$ packets after $d$ time slots. For each
coding scheme of this type, we define the following metrics:

1) Complexity for encoding/processing/decoding at nodes 
A, B and C: the number of operations required as a 
function of $k$, $n_1$ and $n_2$.

2) Delay incurred at the intermediate node B: this is the
time $(d - k/C_{mc})$, where $C_{mc}$ is the min-cut capacity.
We will remark more on this notion of delay in
Section II-A below.

3) Memory requirement: the number of memory elements
needed at node B. Section II-A will also comment on
the minimal memory requirements of any coding scheme
over the line network.

4) Achievable rate: the rate at which information is trans- 
mitted from A to C. We say that a coding scheme is 
optimal in rate, if each individual link is used at a rate 
equal to its capacity. Thus it can achieve the min-cut 
capacity between the source and the destination. 

5) Adaptability: whether the coding scheme needs to be
designed for specific erasure probabilities $\epsilon_1$ and $\epsilon_2$ or
not. Fountain codes, for example, are adaptable in this
sense.

We observe that, although it is possible to design a code 
over a single link that is both adaptable and is optimized for 
achievable rate and delay, the overall coding scheme cannot 
be adaptable if we want to jointly optimize for achievable rate 
and delay. Indeed, assume that $\epsilon_2 = 0$. Then the scheme that
jointly optimizes the delay and the achievable rate requires
node B to transmit (forward) only when it receives a new
packet. However, if $\epsilon_1$ and $\epsilon_2$ are equal and large, then a large
fraction of the packets will get erased. In order to optimize
for delay, node B should transmit about $1/(1 - \epsilon_2)$ packets for
each packet it receives, without waiting to receive the next
packet from node A. Therefore a single scheme cannot be
rate-optimal for both cases.

Depending on the application, different emphases might be
placed on these performance metrics. For example, consider
a real-time application, where information is collected into 
blocks of k packets that are encoded and sent over the channel. 
In other words, we want to transmit the real-time information 
from a source, as it is produced. Assume that we have M 
such blocks. Then the delay overhead at intermediate nodes 
can be considered to be a "set-up" delay for the connection, 
experienced only once, and hence insignificant if M is large. 
On the other hand, the memory requirements at intermediate 
nodes may be restrictive. Indeed, there might exist a large 
number of connections (paths) that share an intermediate node 
that performs processing. Thus, the memory available for 
each individual connection might need to be scaled down 
accordingly. 

A. Optimal Delay and Memory Requirements 

Recall that our notion of delay is linked with the optimal 
time of communication over a single channel with equivalent 
min-cut capacity. Note however that with this definition, it 
is impossible to achieve a 'zero delay' scheme even for the 
simple network of Fig. 1. In fact, even if both links AB and
BC provide perfect feedback, there is an inherent delay to 
be suffered due to the existence of sequential links. As we 
will see, even in this perfect setting, there is also a need for 
memory storage, in amounts that grow with k. In this section 
we will calculate the memory requirements, as well as the 
minimal delay which is incurred when perfect feedback exists; 
certainly no coding scheme that does not rely on feedback can 
transmit in less time. 

The obvious optimal scheme in the presence of feedback
is one where each node repeats transmission of each packet
until it is successfully received at the destination. Node A then
completes transmitting in time $n \approx k/(1 - \epsilon)$. The operations
at node B can be described using a Markov chain with states
$x_i \in \{0, 1, 2, \ldots\}$, indicating the number of received packets
still to be sent at each time; therefore at each time $i$, $x_i$ packets
need to be stored in memory. At each time (when $x_i \neq 0$),
with probability $1 - 2\epsilon(1 - \epsilon)$ the state is unchanged, and
with probability $2\epsilon(1 - \epsilon)$ the state is increased or decreased
by 1, with equal probability. Therefore, after $n$ time slots, the
dynamics of this system resemble those of a random walk with
a reflecting boundary at 0, over $n' = 2\epsilon(1 - \epsilon)n$ steps (there is
a slight correction, due to 'longer stays' at state 0, but for large
$n$, the probability of being at that state is insignificant). Thus
the expected value of $x_n$ is the expected value of the absolute
value of a random walk after $n'$ steps. Therefore
$E[x_n] = O(\sqrt{n'}) = O(\sqrt{2\epsilon k})$, where we have used that
$n \approx k/(1 - \epsilon)$. Node B then completes transmitting the
remaining $x_n$ packets in a time $d \approx x_n/(1 - \epsilon)$. Therefore,
the 'delay' of this scheme is $O(\sqrt{\epsilon k}/(1 - \epsilon))$, while the
expected memory requirement is $O(\sqrt{\epsilon k})$.
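
The queue at node B under perfect feedback can be explored with a small simulation. The sketch below is only illustrative: it assumes equal erasure probabilities on both links, and it assumes that in each slot B can send only a packet it already held at the start of that slot; the function name and return values are hypothetical.

```python
import random


def simulate_feedback_relay(k, eps, seed=0):
    """Simulate the two-link line with perfect feedback on both links.

    Returns (peak_queue, finish_time): the largest number of packets held
    at node B at the end of any slot, and the slot in which C receives
    the k-th packet.
    """
    rng = random.Random(seed)
    sent_by_a = 0   # packets A has successfully delivered to B
    queue = 0       # packets waiting at B
    delivered = 0   # packets successfully delivered to C
    peak_queue = 0
    t = 0
    while delivered < k:
        t += 1
        had_packet = queue > 0  # assumption: B sends only what it already holds
        # Link AB: A retransmits until success; success has probability 1 - eps.
        if sent_by_a < k and rng.random() >= eps:
            sent_by_a += 1
            queue += 1
        # Link BC: B retransmits the head of its queue until success.
        if had_packet and rng.random() >= eps:
            queue -= 1
            delivered += 1
        peak_queue = max(peak_queue, queue)
    return peak_queue, t
```

Running this for increasing `k` at a fixed erasure probability gives a feel for how the storage at B grows with the block length; the sketch does not attempt to verify the constants in the estimate above.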

This argument can be extended to show that in a linear
network with $L$ similar links, where $L$ is a fixed finite number,
each intermediate node incurs a delay of $O(\sqrt{\epsilon k}/(1 - \epsilon))$ and
requires $O(\sqrt{\epsilon k})$ units of memory.



III. Coding Schemes 

In this section we describe and compare a number of coding 
schemes for a line network with L links. In the next section 
we will discuss how these schemes can be extended to more 
general settings. 

We will use the configuration in Fig. 1 with $L = 2$ as the
illustrating example, and assume for simplicity that
$\epsilon_1 = \epsilon_2 =: \epsilon$, in which case $n_1 = n_2 =: n$. In all schemes
below we will use as code $C_1$ over the link AB a fountain code,
such as an LT-code or a Raptor code; as demonstrated in [4] and [7],
these codes are low-complexity, rate-optimal, adaptable codes
over erasure channels. Then for each different coding scheme,
we will specify the code $C_2$ over the link BC. A summary of
the properties of all these schemes is provided in Table I.

A. Complete Decoding and Re-encoding 

An obvious scheme is to use a separate code for each of the 
L links of the line network, and have each intermediate node 
completely decode and re-encode the incoming data. Then it 
is obvious that we can achieve the min-cut capacity by using 
optimal codes (e.g. LT-codes) over each link. However, the
system suffers a delay of about $k\epsilon/(1 - \epsilon)$ time slots due to
each intermediate node. Indeed, at node B, we can directly
forward the $(1 - \epsilon)n$ received coded bits without delay, and
then, after decoding, create and send an additional $\epsilon n$ bits over
the second channel.

This straightforward scheme imposes low complexity
requirements. We only need $O(k \log(k))$ binary operations at
each intermediate node to decode and re-encode an LT-code,
and the complete decoding and re-encoding scheme has
memory requirements of the order $O(k)$. Moreover, LT-codes
adapt to unknown channels in the sense defined previously.

B. Systematic Codes 

The complete decoding and re-encoding scheme of the 
previous section is adaptable, rate optimal and has low com- 
plexity. However it requires each intermediate node to store 
in memory the entire k packets of information in order to re- 
encode. We propose a class of coding schemes, which we call 
systematic schemes, which minimize the memory requirement 
at the intermediate nodes, but require the knowledge of the 
erasure probabilities of the links. 

Once again we consider the network in Fig. 1 and assume
that we use a fountain code $C_1$ for link AB. In a systematic
scheme, the intermediate node B first forwards each coded bit
(packet) from $C_1$ as it is received; these are the systematic
bits (packets). Meanwhile, B forms (about) $n\epsilon = \frac{k\epsilon}{1 - \epsilon}$ linear
combinations of the systematic bits, which are transmitted in
the $n\epsilon$ time slots following the transmission of the systematic
bits. Thus all systematic codes will incur an average delay
of $\epsilon n$, and will require $\epsilon n$ memory elements. The savings
in memory, as compared to the complete decoding and
re-encoding, are significant when the erasure probability $\epsilon$ is small.
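
To make the memory saving concrete, the following sketch evaluates the quantities above for given values of the block length and erasure probability. It is a plain arithmetic helper under the equal-erasure assumption of this section; the function and key names are hypothetical.

```python
def systematic_budget(k, eps):
    """Rough per-relay figures for a systematic scheme on two equal links.

    n is the approximate number of slots each link is used, and
    parity = eps * n is both the number of non-systematic packets and the
    extra delay / memory described in the text. The last entry compares
    that memory with the k packets stored under complete decoding.
    """
    n = k / (1.0 - eps)
    parity = eps * n
    return {
        "slots_per_link": n,
        "parity_packets": parity,
        "memory_vs_full_decode": parity / k,
    }


if __name__ == "__main__":
    # With k = 900 and a 10% erasure rate, roughly 100 parity packets are
    # needed, about one ninth of the memory used by complete decoding.
    print(systematic_budget(900, 0.1))
```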

In a linear network with $L$ links, the same scheme can be
repeated at each intermediate node. Since the operation at each
intermediate node is rate-optimal, it follows that for each fixed
$L$, the overall end-to-end transmission is also rate-optimal for
large enough block length $k$, while each intermediate node
requires about $\epsilon n$ memory elements and contributes a delay
of $k\epsilon/(1 - \epsilon)$.

Below we will discuss a few possible methods to design 
systematic codes. 

1) Fixed Codes: Here we use a fixed systematic code,
consisting of $k$ systematic bits (packets) and $k\epsilon/(1 - \epsilon)$ parity
coded bits, to transmit the information over link BC. A
systematic LT-code [7], or a Tornado code [3], for example,
can be used to generate the parity bits, and in fact any fixed
systematic code can be used for this purpose. Although not
adaptable to unknown channel parameters, these codes have
very low encoding and decoding complexities. Tornado codes
for example can be encoded and decoded with $O(n \log(1/\delta))$
operations, where $\delta$ is a constant expressing the (fixed) rate
penalty.

2) Sparse Random Codes: In this scheme, the non- 
systematic packets are formed as random (sparse) linear com- 
binations of the systematic ones. More precisely, whenever a 
new packet is received at B, it is added to the storage space 
allocated to each of the non-systematic packets independently 
and with a (small) probability p. 
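
The accumulation step at node B can be sketched as follows. This is a minimal illustration, not the authors' implementation: packets are modeled as Python integers used as bit vectors, and the function name, the returned index lists, and the seeded generator are assumptions made for the example.

```python
import random


def sparse_parity_accumulate(packets, num_parity, p, seed=0):
    """Fold incoming packets into parity buffers as they arrive at node B.

    packets: iterable of ints, each int holding the bits of one packet.
    Returns (parities, masks): parities[j] is the XOR of the packets that
    were selected for buffer j, and masks[j] lists their indices.
    """
    rng = random.Random(seed)
    parities = [0] * num_parity
    masks = [[] for _ in range(num_parity)]
    for idx, pkt in enumerate(packets):
        for j in range(num_parity):
            # Independent choice for every (packet, buffer) pair.
            if rng.random() < p:
                parities[j] ^= pkt
                masks[j].append(idx)
    return parities, masks
```

Only the parity buffers need to be kept in memory at B; each systematic packet can be forwarded and discarded as soon as it has been folded in.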

Theorem 1: With $p = (1 + \delta) \log(\epsilon k)/(\epsilon k)$ for $\delta > 0$, the
described systematic random code asymptotically achieves the
capacity over the channel BC.

Proof: [Sketch] Suppose $k' \approx k(1 - \epsilon)$ systematic symbols
are received at C, and let $l = k - k' \approx \epsilon k$. We will then
wait for a further $l + c \log_2(l)$ non-systematic symbols to be
also received at C, where $c > 1$ is a constant. After eliminating
the received systematic symbols, these linear combinations can
be described by a random $(l + c \log(l)) \times l$ binary matrix,
with i.i.d. entries which are nonzero with probability
$p = (1 + \delta) \log(\epsilon k)/(\epsilon k)$. The results of [1] can be extended to
show that, if $p > \log(l)/l$, the probability that such a matrix is
not full-rank approaches zero polynomially fast with $l$. Using
this and the law of large numbers, with high probability
C can retrieve all the $k$ symbols received at B (e.g. by
applying Gaussian elimination to this sparse matrix), which
can then be used to decode the fountain code $C_1$. This code
can decode the $k$ information symbols from an average of
$k + c \log(k\epsilon)$ received symbols at C, and hence this scheme
is rate optimal for large $k$. ■

The complexity of decoding this code is that of inverting
the sparse $k\epsilon \times k\epsilon$ matrix, which is $O((k\epsilon)^2 \log(k\epsilon))$. In fact, it
can be shown that $O(\log(k)/k)$ is the smallest possible value
for the probability p, and equivalently the density of the non- 
systematic part of the code, if the code is to be decodable with 
negligible overhead. In that sense, the scheme provided here 
offers the lowest decoding complexity for any such random 
code where the parity bits are chosen as linear combinations 
of the systematic bits with i.i.d. distribution. 

C. Greedy Random Codes 

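In this scheme, at each time slot the intermediate node B
transmits random linear combinations (over GF(2)) of all the
packets it has received thus far.

TABLE I
Coding schemes that send $k$ bits from the source to the destination.

Scheme             | Intermed. node complexity         | Delay                                             | Memory             | Adaptable | Rate Optimal
-------------------+-----------------------------------+---------------------------------------------------+--------------------+-----------+-------------
Optimal (Feedback) |                                   | $\sqrt{k\epsilon}/(1 - \epsilon)$                 | $\sqrt{k\epsilon}$ | yes       | yes
Complete Dec-Reenc | $k \log k/(1 - \epsilon)$         | $k\epsilon/(1 - \epsilon)$                        | $k$                | yes       | yes
Systematic Fixed   | $k \log(1/\delta)/(1 - \epsilon)$ | $k\epsilon/(1 - \epsilon)$                        | $k\epsilon$        | no        | yes
Systematic Random  | $(k\epsilon)^2 \log(k\epsilon)$   | $k\epsilon/(1 - \epsilon)$                        | $k\epsilon$        | no        | yes
Greedy Random      | $k^2 \log(k)$                     | $\sqrt{k\epsilon \log(k\epsilon)}/(1 - \epsilon)$ | $k$                | yes       | yes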

The main advantages of this random scheme are its
adaptability and optimality in terms of delay. The drawbacks are
a large memory requirement and high decoding complexity,
which is $O(k^2 \log k)$ XOR operations on packets.
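
The per-slot operation at node B can be sketched in a few lines. This is an illustrative sketch only: packets are modeled as Python integers used as bit vectors, and the function name and the explicit coefficient list (which a receiver would need in order to decode) are assumptions of the example.

```python
import random


def greedy_relay_output(received, rng):
    """Return (coeffs, packet) for one time slot at node B.

    coeffs[i] is a uniformly random bit, and packet is the XOR (the GF(2)
    sum) of every packet in `received` whose coefficient is 1.
    """
    coeffs = [rng.randrange(2) for _ in received]
    packet = 0
    for c, pkt in zip(coeffs, received):
        if c:
            packet ^= pkt
    return coeffs, packet


if __name__ == "__main__":
    rng = random.Random(0)
    buffer_at_b = []
    for incoming in [0b0001, 0b0010, 0b0100]:
        buffer_at_b.append(incoming)       # B stores every packet it receives
        print(greedy_relay_output(buffer_at_b, rng))
```

Because every output may involve all packets received so far, B has to keep all of them, which is the memory cost noted above.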

We will need the following proposition to analyze the 
optimality of greedy random codes. 

Proposition 1: Given a constant $c > 1$, let $A$ be a 'random
lower-triangular' $(k + c \log(k)) \times k$ binary matrix, where the
entries $A_{ij}$ are zero for $1 \le i < j \le k$, and all the other
entries are i.i.d. Bernoulli(1/2) random variables. Then

$$\Pr[\mathrm{rank}(A) < k] \le \frac{1}{2k^{c-1}}.$$

Proof: Let $K$ denote the right kernel of $A$, i.e.,

$$K := \{x \in GF(2)^k \mid A \cdot x = 0\}.$$

We will find the expected size of $K$. Let

$$V_i := \{x \in GF(2)^k \mid x_i = 1, \text{ and } x_j = 0 \text{ for } j < i\},$$

that is, $V_i$ is the set of vectors which have their first nonzero
component at position $i$; there are $2^{k-i}$ such vectors. Let $A_j$
denote the $j$th row of $A$. Then it is easy to verify that, for any
$x \in V_i$, the probability that $A_j \cdot x = 0$ is one for $j < i$, and is
1/2 for $j \ge i$. Therefore the expected size of the intersection
of $V_i$ and $K$ is

$$E[|V_i \cap K|] = 2^{k-i} \cdot 2^{-(k + c \log(k) - i + 1)} = \frac{1}{2k^c},$$

with logarithms taken to base 2. The sets $V_i$ for
$i = 1, \ldots, k$ partition $GF(2)^k \setminus \{0\}$, thus the
expected size of $K$ is

$$E[|K|] = 1 + \sum_{i=1}^{k} E[|V_i \cap K|] = 1 + \frac{1}{2k^{c-1}}.$$

Now the expected size of the kernel can be used to bound the
probability that $A$ is not full-rank:

$$E[|K|] \ge \Pr[\mathrm{rank}(A) = k] + 2\Pr[\mathrm{rank}(A) < k] = 1 + \Pr[\mathrm{rank}(A) < k].$$

It follows that $\Pr[\mathrm{rank}(A) < k] \le E[|K|] - 1 = \frac{1}{2k^{c-1}}$.

■
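
Proposition 1 can be checked empirically with a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions: rows are stored as integer bitmasks (bit $b$ standing for column $b + 1$), the number of extra rows is taken as the ceiling of $c \log_2(k)$, and all function names are hypothetical.

```python
import math
import random


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as int bitmasks."""
    basis = {}  # leading bit position -> reduced row with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]  # cancel the leading bit and keep reducing
    return len(basis)


def random_lower_triangular(k, extra, rng):
    """(k + extra) x k matrix: row i (1-based) is zero in columns j > i."""
    rows = []
    for i in range(1, k + extra + 1):
        width = min(i, k)  # only the first `width` columns are random
        rows.append(rng.getrandbits(width))
    return rows


def estimate_failure(k, c, trials, seed=0):
    """Fraction of sampled matrices that are not of full column rank."""
    rng = random.Random(seed)
    extra = math.ceil(c * math.log2(k))
    fails = sum(
        gf2_rank(random_lower_triangular(k, extra, rng)) < k
        for _ in range(trials)
    )
    return fails / trials
```

Comparing `estimate_failure(k, c, trials)` for a few values of `k` and `c` against the bound of the proposition gives a rough sanity check; the estimate is of course noisy for small `trials`.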

An immediate consequence of Proposition 1 is that, if
the channels were noiseless, i.e., $\epsilon = 0$, then the greedy
random coding scheme described above is rate optimal; this is
because, with high probability, node C can perform Gaussian
elimination on the generator matrix of the code $C_2$, which
is a random lower-triangular matrix of the type discussed in
Proposition 1.

A closer examination of the proof of Proposition 1 reveals
that, in order to make $E[|K|] - 1$ converge to zero, it is
sufficient that, for each column $i$, the matrix $A$ contains at least
$k + c \log(k) - i$ rows with Bernoulli(1/2) random variables at
the $i$th position; this will then guarantee that the expected size
of $K \cap V_i$ is no more than $1/k^c$, for some $c > 1$, and summing
over $i$ as in the proof gives the desired result. The interpretation
of this statement in the context of our coding scheme is that,
in order to be able to decode with high probability at C, it is
sufficient that for each $i = 1, \ldots, k$, at least
$k + c \log(k) - i$ packets are successfully
transmitted over BC after B has received the $i$th coded packet
from A.

Let $\alpha_d$ and $\beta_d$ denote the number of packets successfully
transmitted by time $d$ over links AB and BC respectively. Suppose now
that we end transmission at a time $n$ when C has received
$k + l$ packets, i.e., $\beta_n = k + l$, where $l = o(k)$ will be
appropriately chosen. Then the number of packets that will
be received by C after a time $d$ is equal to $k + l - \beta_d$; this
we would like to be at least $k + c \log(k) - \alpha_d$. In other
words, the sufficient conditions above require that at each
time $d = 1, \ldots, n$, the quantity $\alpha_d - \beta_d$ be greater than
$c \log(k) - l$. But $x_d := \alpha_d - \beta_d$ behaves similarly to a symmetric
one-dimensional random walk: in fact, in a $1 - 2\epsilon(1 - \epsilon)$
fraction of the time slots, $x_d$ remains unchanged, while in
the other $2\epsilon(1 - \epsilon)$ fraction, it increases or decreases by 1
with probability 1/2. Therefore, in the $n \approx (k + l)/(1 - \epsilon)$ time slots
it takes to complete transmission as described above, the
movements of $x_d$ are identical to $n'$ steps of a random walk $\{y_i\}$,
where $n' = 2n\epsilon(1 - \epsilon) \approx 2k\epsilon$. Straightforward calculation
then shows that with $l = O(\sqrt{n' \log(n')}) = O(\sqrt{k\epsilon \log(k\epsilon)})$,
the probability that $\{y_i\}$ at any time $i < n'$ goes below
$-l$ is polynomially small in $k$. This proves that, with high
probability, the $k$ packets of information can be retrieved at C
from $k\left(1 + \sqrt{\epsilon \log(k\epsilon)/k}\right)$ received packets. The overhead goes
to zero as $k$ becomes large, and hence this coding scheme is
asymptotically rate optimal.
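
The random-walk behaviour used in this argument can be observed with a short simulation. The sketch below is illustrative only: it assumes equal erasure probabilities on both links and that both links transmit in every slot, and it tracks the difference between the success counts on AB and BC; the function name is hypothetical.

```python
import math
import random


def min_backlog(k, eps, seed=0):
    """Lowest value reached by (successes on AB) - (successes on BC).

    The walk runs for n = ceil(k / (1 - eps)) slots; in each slot the two
    links succeed independently with probability 1 - eps.
    """
    rng = random.Random(seed)
    n = math.ceil(k / (1.0 - eps))
    x = 0
    lowest = 0
    for _ in range(n):
        ab_ok = rng.random() >= eps
        bc_ok = rng.random() >= eps
        x += int(ab_ok) - int(bc_ok)  # moves only when exactly one link succeeds
        lowest = min(lowest, x)
    return lowest
```

How far the returned value dips below zero indicates how many extra packets C would need in a given run, which is the role played by the overhead term in the analysis above.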

IV. General Networks 

In this section, we will represent a communication network 
of binary erasure channels as a directed acyclic graph. 

Assume for simplicity that all edges of the graph have the
same capacity $C_0$. Consider a unicast connection; then the
min-cut capacity between the source and the destination is
$mC_0$ for some integer $m$. It is straightforward to see that
if we are employing a capacity-achieving coding scheme, it
is sufficient to route the information along $m$ parallel paths
$P_1, \ldots, P_m$, where each path $P_i$ consists of $L_i$ links. We can
then directly apply the coding schemes previously described
to each path separately.

In practice, since coding schemes will employ codewords 
of finite block lengths, there might exist benefits in combining 
independent information streams [6]. Moreover, not all edges 
might be used at the same rate, for example because of cost 
considerations. 

Consider a routing scheme that observes the flow conservation
principle and utilizes each edge at a rate smaller than or equal
to its capacity. Since all the component codes are linear,
the received symbols along a link $l$ in the network can be
described using an $(n_l \times k_l)$ matrix, where $k_l$ is the number of
information symbols sent along the link, and $n_l$ is the number
of received symbols. The point we make in this section is that,
as long as all such matrices corresponding to the intermediate
links have full column rank, the end-to-end matrix that the
receiver will have to decode in order to retrieve the information
bits will also be full rank and hence decodable.

Indeed, given the matrices associated with all individual 
links, to create the end-to-end matrix, we will have to perform 
the following types of matrix operations: 

• Partitioning a matrix into parts, to create the equivalent
matrix corresponding to splitting an input stream to
multiple outgoing streams, such as node A in Fig. 2.

• Multiplication of matrices, in order to create the equivalent
matrix corresponding to serially concatenated channels,
such as nodes B and C in Fig. 2.

• Finding the direct sum of matrices, to create the equivalent
matrix corresponding to merging multiple input
streams of a node into a single outgoing stream, such
as node D in Fig. 2.




Fig. 2. Splitting and merging of information in the network. 

All these operations preserve the full-rank property. Thus, all
coding schemes described in Section III can be directly applied
over a more general network. However, for this general case, 
a thorough study of the delay and memory requirements for 
each scheme is not provided here. 
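
The multiplication case can be illustrated on a toy example. The sketch below is a minimal check, not a general proof: the two small matrices are made-up examples of full-column-rank link matrices over GF(2), and the helper names are hypothetical.

```python
def gf2_matmul(a, b):
    """Product of 0/1 matrices (lists of lists) over GF(2)."""
    return [
        [sum(x & y for x, y in zip(row, col)) % 2 for col in zip(*b)]
        for row in a
    ]


def gf2_column_rank(m):
    """Rank over GF(2), by Gaussian elimination on a copy of m."""
    rows = [r[:] for r in m]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [x ^ y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


if __name__ == "__main__":
    link1 = [[1, 0], [0, 1], [1, 1]]                   # 3 x 2, rank 2
    link2 = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]  # 4 x 3, rank 3
    end_to_end = gf2_matmul(link2, link1)              # serial concatenation
    print(gf2_column_rank(end_to_end))                 # 2: still full column rank
```

The same helpers can be used to experiment with the partitioning and direct-sum cases by slicing or block-stacking the link matrices.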



V. Comparison with previous work 

In [8] a scheme was proposed that takes advantage of 
intermediate node processing to approach the min-cut capacity. 
The authors model the departures and arrivals at nodes as 
Poisson processes and work out the queuing-theory aspects 
of the problem. The employed coding scheme allows inter- 
mediate nodes to transmit random linear combinations of the 
incoming packets over a finite field GF(q). The transmission 
rate approaches the min-cut capacity as q goes to infinity. This 
scheme, as described in [8], requires $O(k^2(1 - \epsilon))$ operations
to encode $k$ symbols at the transmitter, $O(k^3)$ operations for
decoding at the receiver, and $O(k^2(1 - \epsilon))$ operations at each
intermediate node. Moreover, the operations are over GF(q),
and are thus more complex than binary operations. Intermediate
nodes require storage capabilities for $k$ packets over GF(q).

The main benefit of the scheme in [8] is in terms of delay, as
we do not decode at each intermediate node. Indeed, complete
decoding and re-encoding requires a delay of $\epsilon n$ time slots.
However, note that the scheme in [8] achieves the min-cut rate
for large $q$, i.e., assuming that we are able to send $\log_2(q)$ bits
per time slot instead of one bit per time slot as we assume.
Thus in this sense it is not clear that the comparison is fair.

In fact, the coding scheme employed in [8] can be thought
of as employing the greedy random codes in Section III-C, where
the linear combinations are performed over GF(q) instead of
the binary field, and where the encoding matrix is not sparse. 
Thus our results can be viewed as an improvement over the 
coding scheme proposed in [8]. 

VI. Conclusion 

In this paper we have examined the problem of communi- 
cation over a line network, where processing of information at 
the intermediate nodes is required in order to achieve the min- 
cut capacity, and we have included guidelines to extend our 
results to general networks. We have proposed coding schemes 
based on fountain codes. Each scheme has been analyzed and 
evaluated in terms of complexity, delay, memory requirement, 
achievable rate, and adaptability (see Table I). In general,
there is a trade-off between these desirable properties, and 
an absolute best scheme is not claimed. 